Abstract

Let $P$ be a set of $n$ points in $\mathbb{R}^2$. Given a rectangle $Q = [\alpha_1, \alpha_2] \times [\beta_1, \beta_2]$, a range skyline
query returns the maxima of the points in $P \cap Q$. An important variant is the top-open
query, where $Q$ is a 3-sided rectangle whose upper edge is grounded at $y = \infty$ (that is,
$\beta_2 = \infty$). These queries are crucial in numerous database applications. In internal memory,
extensive research has been devoted to designing data structures that answer such queries
efficiently. In contrast, their exact complexities in external memory are currently not well
understood.

This paper presents several structures of linear size for answering the above queries with
the optimal I/O cost. We show that a top-open query can be solved in $O(\log_B n + k/B)$ I/Os,
where $B$ is the block size and $k$ is the number of points in the query result. The query cost
can be made $O(\log\log_B U + k/B)$ when the data points lie in a $U \times U$ grid for some integer
$U > n$, and further lowered to $O(1 + k/B)$ if $U = O(n)$. The same efficiency also applies to
3-sided queries where $Q$ is a right-open rectangle. However, the hardness of the problem increases
if $Q$ is a left- or bottom-open 3-sided rectangle. We prove that any linear-size structure must
perform $\Omega((n/B)^\epsilon + k/B)$ I/Os to solve such a query in the worst case, where $\epsilon > 0$ can be an
arbitrarily small constant. In fact, left- and bottom-open queries are just as difficult as general
(4-sided) queries, for which we give a linear-size structure with query time $O((n/B)^\epsilon + k/B)$.
Interestingly, this indicates that 4-sided range skyline queries have exactly the same hardness
as 4-sided range reporting (where the goal is simply to report the whole $P \cap Q$). That is, the
skyline requirement does not alter the problem difficulty at all.



1 Introduction 



Let $p$ and $q$ be two different points in $\mathbb{R}^2$, where $\mathbb{R}$ denotes the real domain. We say that $p$
dominates $q$ if $x_p > x_q$ and $y_p > y_q$, where $x_p$ and $y_p$ denote the x- and y-coordinates of $p$,
respectively (similarly for $x_q$ and $y_q$). Given a set $P$ of points in $\mathbb{R}^2$, a point $p \in P$ is a maximum
if it is not dominated by any point in $P$. The skyline of $P$ contains all and only the maxima of $P$.
Figure 1a shows an example where $P$ consists of all the points, and its skyline is the set of three
black points.
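The definitions above can be sketched directly in code. The following is a minimal quadratic-time illustration, not an I/O-efficient method; the function names `dominates` and `skyline` and the sample point set are hypothetical and chosen only for this example.

```python
def dominates(p, q):
    """True if p dominates q, i.e., p is strictly larger in both coordinates."""
    return p[0] > q[0] and p[1] > q[1]


def skyline(points):
    """Return the maxima of `points` in ascending x-order (brute force)."""
    return sorted(p for p in points
                  if not any(dominates(q, p) for q in points))


# Hypothetical sample set in general position (distinct x and distinct y).
example = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]
print(skyline(example))  # [(1, 5), (3, 4), (5, 2)]
```

Listed in ascending x-order, the maxima have descending y-coordinates, which is the staircase shape that the later sections exploit.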

Given an axis-parallel rectangle $Q$, a range skyline query (also known as range maxima query)
reports the skyline of $P \cap Q$, that is, the skyline of only the points of $P$ covered by $Q$. In Figure 1b,
for instance, $Q$ is the shaded rectangle, and the two black points constitute the query result.
Depending on the shape of $Q$, range skyline queries have several variations. Specifically, when
$Q$ is a three-sided rectangle (i.e., an edge of $Q$ is grounded on a boundary of the universe), a
range skyline query becomes a top-open, right-open, bottom-open or left-open query, as depicted in
Figures 2a-2d, respectively. When $Q$ is a two-sided rectangle whose top-right (bottom-left) corner
coincides with that of the universe, it is a dominance (anti-dominance) query, shown in Figure 2e
(2f). Another well-studied variation is the contour query, where $Q$ is a one-sided rectangle that is
the half-plane to the left of a vertical line; see Figure 2g.

Figure 1: Range skyline queries. (a) Skyline (b) Range skyline
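To make the query variations concrete, here is a small brute-force sketch in which an omitted side of $Q$ defaults to infinity. The function `range_skyline`, its keyword parameters, and the sample points are assumptions made for illustration; the sketch scans all of $P$ and is unrelated to the structures developed in this paper.

```python
import math


def range_skyline(points, x_lo=-math.inf, x_hi=math.inf,
                  y_lo=-math.inf, y_hi=math.inf):
    """Skyline of the points covered by Q = [x_lo, x_hi] x [y_lo, y_hi]."""
    inside = [p for p in points
              if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi]
    return sorted(p for p in inside
                  if not any(q[0] > p[0] and q[1] > p[1] for q in inside))


P = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]  # hypothetical data

top_open = range_skyline(P, x_lo=2, x_hi=4, y_lo=2)    # y_hi = infinity
left_open = range_skyline(P, x_hi=4, y_lo=1, y_hi=4)   # x_lo = -infinity
dominance = range_skyline(P, x_lo=2, y_lo=2)           # top-right corner open
contour = range_skyline(P, x_hi=3)                     # half-plane x <= 3
print(top_open, left_open, dominance, contour)
```

Each variation is obtained purely by which sides are left unbounded, which mirrors how the paper classifies the queries.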



It is easy to observe some connections among these variations. First, top- and right-open queries
are equivalent due to symmetry, and so are bottom- and left-open queries. However, top- (right-)
open queries are not equivalent to bottom- (left-) open queries, as we will prove later in this paper.
In other words, the four types of three-sided queries are divided into two groups with distinct
characteristics, which intuitively is because the skyline definition is not symmetric with respect to
all corners of the universe. Dominance, anti-dominance, and contour queries are each special cases
of at least one type of three-sided query.

The focus of this paper is how to preprocess the input $P$ into a data structure, so that range
skyline queries can be efficiently answered. This has been extensively studied in theoretical
computational geometry [Til HI El 033 EHl [22], [23] EI]. In the database area, skylines have drawn
very significant attention (see [31 EJ UM EH ESI E3 EH E21 [33] and the references therein) due to
their crucial importance to multi-criteria optimization, which in turn is vital to a large number
of applications. In particular, the rectangle of a range skyline query represents range predicates
specified by a user. An effective index is essential for maximizing the efficiency of these queries in
database systems [Ml ES].

Figure 2: Variations of range skyline queries (black points represent the query results).
(a) Top-open (b) Right-open (c) Bottom-open (d) Left-open (e) Dominance (f) Anti-dominance (g) Contour

Unless otherwise stated, we assume that the data universe is $\mathbb{R}^2$. In practice, each dimension
often has a finite domain. Formally, given an integer $U > 0$, let $[U]$ represent the set $\{0, 1, \ldots, U-1\}$.
All the above query variations remain well defined in the universe $[U]^2$ (i.e., a $U \times U$ grid). Set
$n = |P|$. An important degenerate case is $U = O(n)$, where the universe $[O(n)]^2$ is called the rank
space. In general, for a smaller universe, it may be possible to achieve better query time under
the same space budget. Finally, we consider that $P$ is in general position, i.e., no two points in $P$
have the same x- or y-coordinate. Input sets violating this condition can be easily dealt with by
standard tie-breaking methods. As a convention, when omitted, the base of a logarithm is assumed
to be 2, i.e., $\log x = \log_2 x$.
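As a brief illustration of the rank space, the sketch below replaces every coordinate by its rank, which maps $n$ points in general position into $[n]^2$ while preserving all dominance relations. The helper name `to_rank_space` is hypothetical, and the sketch says nothing about how query boundaries would be translated (that step typically requires a predecessor search).

```python
def to_rank_space(points):
    """Map points in general position to [n]^2 by replacing each coordinate
    with its rank; dominance between points is preserved."""
    x_rank = {x: i for i, x in enumerate(sorted(p[0] for p in points))}
    y_rank = {y: i for i, y in enumerate(sorted(p[1] for p in points))}
    return [(x_rank[x], y_rank[y]) for x, y in points]


print(to_rank_space([(0.5, 9.0), (2.25, -1.0), (7.0, 3.5)]))
# [(0, 2), (1, 0), (2, 1)]
```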

1.1 Computation Model 

Traditionally, the cost of an algorithm is measured as the amount of CPU time elapsed, as is natural 
when the input set can be accommodated in memory. Today, information is being accumulated at 
an unprecedented rate, which exceeds by far how fast the memory size of a commodity machine 
grows. Consequently, in many applications, the dataset cannot be contained in memory, but instead 
must be stored on disk. Since computation can happen only on data residing in (main) memory,
an algorithm must perform many I/Os to move data from the disk into memory. As the speed ratio 
of memory to disk continuously escalates, the I/O time incurred by an algorithm dwarfs the CPU 
overhead to such an extent that the algorithm's cost depends almost entirely on how many I/Os 
are performed. 

Motivated by this, we investigate range skyline queries in the external memory (EM) model,
which was proposed in pQ and has become the dominant computation model for studying
I/O-efficient algorithms. In this model, a machine has $M$ words of memory, and a disk of
unbounded size. The disk is formatted into disjoint blocks, each of which is formed by $B$
consecutive words. An I/O loads a block of data from the disk to memory, or conversely, writes $B$
words in memory to a disk block. The space of a structure equals the number of blocks it occupies,
while the time of an algorithm equals the number of I/Os it performs. CPU time is free.

The value of $M$ is no less than $2B$, i.e., the memory can hold at least two blocks of data. If
the input set requires $O(n)$ words to store, $O(n/B)$ is referred to as the linear cost because it takes
this many I/Os to scan the input once. Moreover, logarithmic cost means $O(\log_B n)$, i.e., the base
should be $B$ instead of a constant. Finally, if the universe is $[U]^2$ where $U$ is a finite integer, a
machine word is assumed to have $\Omega(\log U)$ bits. This also means that a block has $\Omega(B \log U)$ bits.
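For a rough sense of scale, the following sketch evaluates the linear and logarithmic costs for one hypothetical setting ($n = 10^9$ words and $B = 10^3$ words per block). The function names and the chosen values are illustrative assumptions, not figures from this paper.

```python
import math


def linear_cost(n, B):
    """Blocks read by one sequential scan of n words: ceil(n / B)."""
    return -(-n // B)


def logarithmic_cost(n, B):
    """log_B n, the logarithmic cost in the EM model."""
    return math.log(n) / math.log(B)


n, B = 10**9, 10**3
print(linear_cost(n, B))                  # one scan: a million I/Os
print(round(logarithmic_cost(n, B), 3))   # about 3 levels with base B
print(round(math.log2(n), 3))             # about 30 levels with base 2
```

The gap between base $B$ and base 2 is why a bound such as $O(\log_2 n)$ I/Os is not considered logarithmic in this model.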

1.2 Previous Results 

Range skyline in internal memory. We first review the existing results when the whole input
set $P$ fits in memory. Early research on this topic focused on dominance and contour queries,
both of which can be solved in $O(\log n + k)$ time using a structure of $O(n)$ size, where $k$ is the
number of points reported [11, 13, 19, 22, 27]. Brodal and Venkatesh [7] were the first to discover
an optimal dynamic structure for top-open queries, which capture both dominance and contour
queries as special cases. Their structure occupies $O(n)$ space, answers a query in $O(\log n + k)$
time, and supports an insertion and deletion in $O(\log n)$ time. The above structures belong to
the pointer machine model. Utilizing features of the RAM model, Brodal and Venkatesh [7] also
presented an alternative structure in universe $[U]^2$, which also uses $O(n)$ space, but answers a
query in $O(\frac{\log n}{\log\log n} + k)$ time, and can be updated in $O(\frac{\log n}{\log\log n})$ time per insertion and deletion.
It is worth mentioning that the static version of the problem can be easily settled in RAM using
an RMQ (range minimum queries) structure (see, e.g., [39]), which uses $O(n)$ space and solves a
top-open query in $O(1 + k)$ time.

For general range skyline queries (with 4-sided rectangles), all the known structures demand
super-linear space. Specifically, Brodal and Venkatesh [7] gave a pointer-machine structure of
$O(n \log n)$ size, $O(\log^2 n + k)$ query time, and $O(\log^2 n)$ update time. Kalavagattu et al. [20] designed
a static RAM-structure that occupies $O(n \log n)$ space and achieves query time $O(\log n + k)$. In
the rank space $[O(n)]^2$, Das et al. [12] proposed a static RAM-structure with $O(n \frac{\log n}{\log\log n})$ space
and $O(\frac{\log n}{\log\log n} + k)$ query time.

Although the above results also hold directly in external memory, they are far from being
satisfactory. In particular, all of them incur $\Omega(k)$ I/Os to report a query result of $k$ points. An
I/O-efficient structure ought to achieve $O(k/B)$ I/Os for the same purpose.

Range skyline in external memory. In contrast to internal memory, where there exist a large
number of results, range skyline queries have not been well studied in external memory. As a naive
solution, we can first scan the entire $P$ to eliminate the points falling outside the query rectangle
$Q$, and then find the skyline of the remaining points by the fastest skyline algorithm [33] on
non-preprocessed input sets. This solution is very expensive, and can incur $O((n/B) \log_{M/B}(n/B))$
I/Os. Papadias et al. [29] described a branch-and-bound algorithm when the dataset is indexed
by an R-tree [U [16]. The algorithm is heuristic in nature and cannot guarantee better query time
than the naive solution mentioned earlier in the worst case.

Very recently, Kejberg-Rasmussen et al. [23] designed the first I/O-efficient structure for
top-open queries. For any $\epsilon$ satisfying $0 \le \epsilon \le 1$, their structure occupies $O(n/B^{1-\epsilon})$ space, answers
a query in $O(\log_{2B^\epsilon} n + k/B^{1-\epsilon})$ I/Os, and supports an update in $O(\log_{2B^\epsilon} n)$ I/Os. For our
discussion, the most interesting tradeoff is obtained with $\epsilon = 0$, in which case the structure uses
linear space, has $O(\log_2 n + k/B)$ query time and $O(\log_2 n)$ update time. For $\epsilon > 0$, the
structure requires more than linear space by a large factor $B^\epsilon$, while the query cost is higher than
the desired $O(\log_B n + k/B)$ bound also by $B^\epsilon$ times for large $k$. No solution is known for 4-sided
range skyline queries.

Pointer-machine lower bound. An astute reader would have noticed that all the above results
for three-sided queries focus exclusively on top-open (equivalently, right-open) queries. In
particular, no structure for left-open (equivalently, bottom-open) queries can match the space-query
tradeoff of any top-open structure aforementioned. This is not accidental, at least not under the
(internal memory) pointer machine, according to a lower bound result of Kejberg-Rasmussen et
al. [23]. They constructed a set of hard anti-dominance queries over a low-discrepancy point set
by Chazelle [8]. Combining this query set with a result of Chazelle and Liu [9], they proved that,
for any structure to answer an anti-dominance query in $O(\log^c n + k)$ time for any constant $c > 0$,
the space consumption must be $\Omega(n \frac{\log n}{\log\log n})$. Since anti-dominance queries are special cases of
left-open queries, the same lower bound also applies to the latter, namely, no structure of $O(n)$
size can solve a left-open query in $O(\log n + k)$ time. In external memory, this implies that a
pointer-machine structure of linear size cannot guarantee $O(\log_B n + k/B)$ query time. No lower
bound, however, is known for I/O-efficient structures outside the pointer machine.

Other related results in external memory. Range skyline queries stem from the marriage of
skyline and range queries. Specifically, given an axis-parallel rectangle $Q$, a range query reports
all the points of the dataset $P$ that are covered by $Q$. These queries have been well understood in
external memory. There exists a linear-size structure that answers a range query in $O((n/B)^\epsilon + k/B)$
I/Os [151 EH [35]. As another interesting tradeoff, a structure of [2] uses $O(\frac{n}{B} \cdot \frac{\log n}{\log\log_B n})$ space and
solves a query in $O(\log_B n + k/B)$ I/Os. Both of the above tradeoffs are optimal, according to
the lower bounds of [2[ [T71 [211 EH ES]. This means that one cannot hope to achieve query time
$O(\log_B n + k/B)$ with a structure of linear size.

Three-sided range queries are an important variant where $Q$ is a three-sided rectangle (same as
those in Figures 2a-2d). Note that since no corner-asymmetry (such as dominance) exists for range
queries, the orientation of $Q$ is irrelevant. Namely, no matter which side of $Q$ is open, all three-sided
range queries can be supported by an identical structure after rotation. Occupying linear space,
the external priority search tree [2] is able to answer a three-sided query in $O(\log_B n + k/B)$ I/Os.

Most research in external memory makes the indivisibility assumption. Informally, this
assumption requires that a data structure should store every point coordinate as an atom. We are not
allowed, for example, to cut the bits of a coordinate into multiple pieces, and place them into
different blocks; neither can we compress the coordinate to save space. A justification of the assumption
is that it holds for most practical structures (e.g., B-trees, R-trees, kd-trees, etc.). Another more
theoretical justification is that it facilitates the development of lower bounds. Indeed, in
external memory, all existing lower bounds of range queries [21 [TTl [2TJ Ell EE] were derived with this
assumption in mind^1.

Efforts have been made in recent years towards understanding I/O-efficient algorithms without
the indivisibility assumption [18, 25, 30, 37]. The general observation is that, when the universe is
finite (e.g., $[U]^2$ for some integer $U$), sometimes it is possible to derive a space-query tradeoff better
than the optimal tradeoff in an infinite universe like $\mathbb{R}^2$. As an example related to our work, Larsen
and Pagh [25] show that, in the rank space $[O(n)]^2$, there is a linear-size structure that answers a
3-sided range query in $O(1 + k/B)$ I/Os, i.e., shaving off the additive term $O(\log_B n)$ in the
query time of the external priority search tree.

1.3 Our Results 

It is clear from the above discussion that the exact complexities of range skyline queries in
external memory are currently not well understood. This paper addresses this issue by
presenting a set of linear-size structures with optimal query efficiency, as elaborated below.

First, for top-open queries, we give an elegant reduction that converts the problem to segment
intersection. Specifically, in the latter problem, the input is a set $S$ of horizontal segments in $\mathbb{R}^2$.
Given a vertical segment $q$, a query reports all the segments of $S$ intersecting $q$. The segment
intersection problem can be settled by a persistent B-tree [28]. It immediately implies a linear-size
structure that answers a top-open query optimally in $O(\log_B n + k/B)$ I/Os (Theorem 1). The
result applies to dominance and contour queries as well, since they are special cases of top-open
queries.

As a second step, we prove an interesting feature of our top-open structure: it can be constructed
in $O(n/B)$ I/Os after the input set $P$ has been sorted on x-coordinates (Theorem 1). We call this
property sort-aware build-efficient (SABE), which in general says that a structure can be built in
linear I/Os after the input elements have been arranged in a certain order. The importance of
SABE is that it singles out the most expensive step (i.e., sorting) from the rest of the construction.
A non-trivial SABE example is the external priority search tree: Tao [36] showed how to utilize
this property to reduce the cost of updating a structure where the external priority search tree is
deployed as a secondary structure. We prove that our top-open structure is SABE by designing
a linear-time algorithm for constructing a persistent B-tree. This is interesting because persistent
B-trees are not SABE in general [38]. Our algorithm is assisted by several intrinsic properties of
top-open queries that are established for the first time in this paper.

^1 This is not completely true for [35], whose results hold under a weaker form of the assumption.

The above structures are indivisible, namely, they obey the indivisibility assumption (this fact
has the significance that the aforementioned results can be achieved without using the extra power
allowed by breaking the assumption). Next, we improve the query time beyond the logarithmic
bound when the data universe is small. Specifically, when the universe is $[U]^2$ where $U$ is an
integer, we give a divisible linear-size structure with query time $O(\log\log_B U + k/B)$, which is
optimal (Corollary 1). In the rank space, the query time can be further reduced to $O(1 + k/B)$
(Theorem 2). These results are based on a new divisible structure that answers a ray dragging
query in $O(1)$ I/Os on a small point set $P$ (Lemma 0j). Specifically, given a vertical ray $\rho$, a ray
dragging query reports the first point in $P$ hit by moving $\rho$ leftwards.

Our final contribution targets anti-dominance, bottom-open (hence, left-open), and the most
general 4-sided queries. We prove that all of them actually have exactly the same hardness as
far as indivisible linear-size structures are concerned. Specifically, any such structure must incur
$\Omega((n/B)^\epsilon + k/B)$ I/Os answering a query in the worst case (Corollary 2), where $\epsilon > 0$ can be
an arbitrarily small constant. Furthermore, this is tight because there is a linear-size structure
matching this efficiency (Theorem H|). Recall that $\Omega((n/B)^\epsilon + k/B)$ is also the optimal query time
of range queries under the linear space budget (see Section 1.2). Therefore, interestingly, range
skyline queries are just as hard as range queries, i.e., the skyline requirement does not change the
problem difficulty at all.

In all cases, our query algorithms report the points of a query result by the order of y-coordinates
(equivalently, by the order of x-coordinates). This is not trivial because if the reporting order is
not guaranteed, obtaining it requires sorting in $O(\frac{k}{B} \log_{M/B} \frac{k}{B})$ I/Os.

2 SABE Top-open Structure 

This section will describe a structure of linear size that solves a top-open query in $\mathbb{R}^2$ using
$O(\log_B n + k/B)$ I/Os. The structure is SABE, i.e., it can be built in $O(n/B)$ I/Os if the
input set $P$ has been sorted by x-coordinates.

Throughout the paper, we will make frequent use of B-trees. Unless otherwise stated, a B-tree 
has both leaf capacity and internal fanout set to B. Specifically, the former gives the maximum 
number of elements that can be stored in a leaf node, whereas the latter specifies the maximum 
number of child nodes that an internal node can have. 

2.1 Reduction to Segment Intersection 

We first describe a simple linear-size structure with $O(\log_B n + k/B)$ query time. This is achieved
with an elegant reduction that converts a top-open query to a query of segment intersection.

Figure 3: Reduction



Let $p$ be any point in $P$. Denote by $\mathit{leftdom}(p)$ the leftmost point among all the points in
$P$ dominating $p$. If such a point does not exist, $\mathit{leftdom}(p)$ = NULL. We convert $p$ to a horizontal
segment $\sigma(p)$ as follows. Let $q = \mathit{leftdom}(p)$. If $q$ = NULL, then $\sigma(p) = [x_p, \infty) \times y_p$, that is,
the horizontal segment whose x-span is the interval $[x_p, \infty)$, and whose y-projection is $y_p$. Otherwise
(i.e., $q$ exists), $\sigma(p) = [x_p, x_q) \times y_p$. Define $\Sigma(P) = \{\sigma(p) \mid p \in P\}$, i.e., the set of segments
converted from the points of $P$.

Now, consider a top-open query with rectangle $Q = [\alpha_1, \alpha_2] \times [\beta, \infty)$. We answer it by performing
segment intersection on $\Sigma(P)$. First, obtain $\beta'$ as the highest y-coordinate of the points in $P \cap Q$.
Then, report all segments in $\Sigma(P)$ that intersect the vertical segment $\alpha_2 \times [\beta, \beta']$.
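The reduction can be sketched by brute force as follows. This is only an in-memory illustration of the logic, with no persistent B-tree involved; the names `leftdom`, `sigma` and `top_open_query`, the sample data, and the choice to return an empty list when $P \cap Q$ is empty (a case the text does not discuss) are assumptions made here.

```python
import math


def leftdom(p, points):
    """Leftmost point dominating p, or None if no point dominates p."""
    doms = [q for q in points if q[0] > p[0] and q[1] > p[1]]
    return min(doms) if doms else None  # tuples compare by x first


def sigma(p, points):
    """Segment [x_p, x_q) x y_p encoded as (x_left, x_right_exclusive, y)."""
    q = leftdom(p, points)
    return (p[0], q[0] if q else math.inf, p[1])


def top_open_query(points, a1, a2, beta):
    """Answer Q = [a1, a2] x [beta, inf) via segment intersection."""
    inside = [p for p in points if a1 <= p[0] <= a2 and p[1] >= beta]
    if not inside:  # assumption: an empty P ∩ Q yields an empty result
        return []
    beta_prime = max(p[1] for p in inside)
    hits = []
    for p in points:
        left, right, y = sigma(p, points)
        # sigma(p) intersects the vertical segment a2 x [beta, beta_prime]
        if left <= a2 < right and beta <= y <= beta_prime:
            hits.append(p)
    return sorted(hits, key=lambda p: p[1])  # ascending y-order


P = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]  # hypothetical data
print(top_open_query(P, 2, 4, 1))  # [(4, 1), (3, 4)]
```

Note that the intersection test never looks at $\alpha_1$ directly; the left boundary enters only through $\beta'$, which is the point of the reduction.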

To illustrate the reduction, Figure 3b shows the segments in $\Sigma(P)$, where $P$ is the set of points
in Figure 3a. For example, $p_2 = \mathit{leftdom}(p_1)$, which is why $\sigma(p_1)$ terminates at $x_{p_2}$. The shaded
rectangle represents the search area $Q$ of a top-open query. In this case, the value of $\beta'$ equals $y_{p_2}$.
Hence, our algorithm creates the dashed vertical segment shown in Figure 3b, and finds all the
segments of $\Sigma(P)$ intersecting it (i.e., $\sigma(p_2)$ and $\sigma(p_3)$). It is easy to verify that the query result is
$\{p_2, p_3\}$. The lemma below formally establishes the correctness of our algorithm.

Lemma 1. Our algorithm correctly answers all top-open queries. 

Proof. Consider any point $p \in P$. We show that our algorithm reports $p$ if and only if $p$ satisfies
the top-open query with search area $Q = [\alpha_1, \alpha_2] \times [\beta, \infty)$.

If direction: As $p$ satisfies the query, we know that $p \in Q$, $y_p \le \beta'$, and $q = \mathit{leftdom}(p) \notin Q$.
The last fact suggests that $x_q > \alpha_2$ (in the special case $q$ = NULL, define $x_q = \infty$). Hence,
$\sigma(p) = [x_p, x_q) \times y_p$ intersects the vertical segment $\alpha_2 \times [\beta, \beta']$, and thus will be reported by
our algorithm.

Only-if direction: Let $p$ be a point found by our algorithm, i.e., $\sigma(p) = [x_p, x_q) \times y_p$ intersects
$\alpha_2 \times [\beta, \beta']$, where $q = \mathit{leftdom}(p)$. It follows that $x_p \le \alpha_2 < x_q$ and $\beta \le y_p \le \beta'$.

Next, we prove $\alpha_1 \le x_p$. Recall that $\beta'$ is the y-coordinate of the highest point $p'$ among all the
points in $P \cap Q$. If $p = p'$, then $\alpha_1 \le x_p$ clearly holds. Otherwise, we know $y_p < y_{p'}$, which implies
that $x_p > x_{p'}$. This is because if $x_p < x_{p'}$, then $p'$ dominates $p$, which (because $x_{p'} \le \alpha_2 < x_q$)
violates the definition of $q$. Now, $x_p \ge \alpha_1$ follows from $x_p > x_{p'} \ge \alpha_1$.

So far we have shown that $p$ is covered by $Q$. It remains to prove that $p$ is not dominated by
any point in $P \cap Q$. This is true because $\alpha_2 < x_q$ suggests that the leftmost point in $P$ dominating
$p$ must be outside $Q$. □

Note that $\beta'$ can be easily found in $O(\log_B n)$ I/Os with a range-max query on a slightly
augmented B-tree indexing the x-coordinates in $P$. To facilitate retrieving the segments intersecting
$\alpha_2 \times [\beta, \beta']$, we store $\Sigma(P)$ in a persistent B-tree [28]. As $\Sigma(P)$ has $n$ segments, the persistent B-tree
occupies $O(n/B)$ space and answers a segment intersection query in $O(\log_B n + k/B)$ I/Os. We
thus have obtained a linear-size top-open structure with $O(\log_B n + k/B)$ query time.

More effort, however, is needed to make the structure SABE. In particular, we have to
overcome two challenges. First, we must generate $\Sigma(P)$ in linear I/Os. Second, the persistent
B-tree on $\Sigma(P)$ must be built in the same amount of time. We explain how to achieve these purposes
in the subsequent sections. It is worth noting that the (augmented) B-tree for the computation of
$\beta'$ can be easily built in linear I/Os after $P$ is sorted by x-coordinates.

2.2 Computing $\Sigma(P)$

$\Sigma(P)$ is not an arbitrary set of segments. Next, we reveal two properties about it, which are
behind the correctness and efficiency of our algorithms.

Lemma 2. $\Sigma(P)$ has the following properties:

• (Nesting) for any two segments $s_1$ and $s_2$ in $\Sigma(P)$, their x-intervals are either disjoint, or
such that one x-interval contains the other.

• (Monotonic) let $\ell$ be any vertical line, and $\Sigma(\ell)$ the set of segments in $\Sigma(P)$ intersected by
$\ell$. If we order the segments in $\Sigma(\ell)$ bottom-up by y-coordinates, the lengths of their x-intervals
increase monotonically.

Proof. Nesting: Let $p_1$ and $p_2$ be the points such that $s_1 = \sigma(p_1)$ and $s_2 = \sigma(p_2)$. Assume without
loss of generality that $x_{p_1} < x_{p_2}$. Consider first the case $y_{p_1} < y_{p_2}$. In this scenario, the x-interval
of $s_1$ must terminate before $x_{p_2}$ because $p_2$ dominates $p_1$. In other words, $s_1$ and $s_2$ have disjoint
x-intervals.

We now discuss the case $y_{p_1} > y_{p_2}$. If $\mathit{leftdom}(p_1)$ has x-coordinate smaller than $x_{p_2}$, $s_1$ and $s_2$
have disjoint x-intervals. Otherwise, $\mathit{leftdom}(p_1)$ also dominates $p_2$, implying that the x-interval of
$s_2$ is enclosed in that of $s_1$.

Monotonic: Let $\ell$ intersect the x-axis at $\alpha$. Consider the contour query with rectangle $Q =
(-\infty, \alpha] \times (-\infty, \infty)$, which is a special top-open query. By Lemma 1, the left endpoints of the
segments in $\Sigma(\ell)$ constitute the skyline of $P \cap Q$. Therefore, if we enumerate the segments of $\Sigma(\ell)$
in ascending order of y-coordinates, their left endpoints' x-coordinates decrease continuously. It
thus follows from the nesting property that their x-intervals have increasing lengths. □
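The two properties can be checked empirically on a small input. In the sketch below, segments are built by brute force and encoded as hypothetical triples `(x_left, x_right_exclusive, y)`. Because several segments may extend to infinity, the monotonic check compares endpoints (left endpoints strictly decreasing and right endpoints non-decreasing, bottom-up) rather than lengths; that reading of "increasing lengths" is an interpretation adopted here.

```python
import math
from itertools import combinations


def segments(points):
    """Brute-force Sigma(P) as triples (x_left, x_right_exclusive, y)."""
    segs = []
    for p in points:
        doms = [q[0] for q in points if q[0] > p[0] and q[1] > p[1]]
        segs.append((p[0], min(doms, default=math.inf), p[1]))
    return segs


def is_nesting(segs):
    """Every pair of x-intervals is disjoint or nested."""
    for (l1, r1, _), (l2, r2, _) in combinations(segs, 2):
        disjoint = r1 <= l2 or r2 <= l1
        nested = (l1 <= l2 and r2 <= r1) or (l2 <= l1 and r1 <= r2)
        if not (disjoint or nested):
            return False
    return True


def is_monotonic(segs, x):
    """Bottom-up along the vertical line at x, the intervals only widen."""
    hit = sorted((s for s in segs if s[0] <= x < s[1]), key=lambda s: s[2])
    return all(lo[0] > hi[0] and lo[1] <= hi[1]
               for lo, hi in zip(hit, hit[1:]))


P = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]  # hypothetical data
S = segments(P)
print(is_nesting(S), all(is_monotonic(S, x) for x, _ in P))
```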

We are ready to present our algorithm for computing $\Sigma(P)$, after $P$ has been sorted by
x-coordinates. Conceptually, we sweep a vertical line $\ell$ from $x = -\infty$ to $\infty$. At any time, the
algorithm (essentially) stores the set $\Sigma(\ell)$ of segments in a stack (recall the definition of $\Sigma(\ell)$
in Lemma 2). More specifically, the segments of $\Sigma(\ell)$ are pushed onto the stack in descending order of
y-coordinates (i.e., the highest segment enters the stack first). Whenever a segment is popped out of
the stack, its right endpoint is decided, and output. In general, the segments of $\Sigma(P)$ are output
in ascending order of their right endpoints' x-coordinates.

We now give the algorithm's details. It starts by pushing the first (i.e., leftmost) point of $P$ into
the stack. Iteratively, let $p$ be the next point fetched from $P$. We check whether $y_p > y_q$, where
$q$ is the point currently at the top of the stack. If yes, we know that $p = \mathit{leftdom}(q)$. Hence, the
algorithm pops $q$ out of the stack, and outputs segment $\sigma(q) = [x_q, x_p) \times y_q$. Then, setting $q$ to the
point that tops the stack now, the algorithm checks again whether $y_p > y_q$, and repeats the above
steps if yes. This continues until either the stack is empty or $y_p < y_q$. In either case, the iteration
finishes by pushing $p$ into the stack. Once $P$ is exhausted, every point $q$ remaining in the stack has
$\mathit{leftdom}(q)$ = NULL, so its segment is $\sigma(q) = [x_q, \infty) \times y_q$.

Since each point is pushed and popped at most once and $P$ is scanned sequentially, the algorithm
generates $\Sigma(P)$ in $O(n/B)$ I/Os.
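The sweep just described can be sketched in memory as follows. The function name `compute_segments` and the triple encoding `(x_left, x_right_exclusive, y)` are hypothetical; the final loop, which emits the segments of the points still on the stack with right endpoint infinity, follows from the definition of $\sigma(p)$ when $\mathit{leftdom}(p)$ = NULL and is spelled out here as an assumption about how the sweep terminates.

```python
import math


def compute_segments(points_sorted_by_x):
    """Stack-based sweep producing Sigma(P) in ascending order of right
    endpoints. Input must be sorted by x and in general position."""
    stack, out = [], []
    for p in points_sorted_by_x:
        # p is leftdom(q) for every stacked q that lies below p.
        while stack and p[1] > stack[-1][1]:
            q = stack.pop()
            out.append((q[0], p[0], q[1]))
        stack.append(p)
    # Assumption: points left on the stack have no dominator.
    while stack:
        q = stack.pop()
        out.append((q[0], math.inf, q[1]))
    return out


P = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]  # hypothetical data
print(compute_segments(P))
# [(2, 3, 3), (4, 5, 1), (5, inf, 2), (3, inf, 4), (1, inf, 5)]
```

Every point is pushed once and popped once, so the sweep performs a linear number of stack operations.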

2.3 Constructing the Persistent B-tree

Remember that we need a persistent B-tree $T$ on $\Sigma(P)$. Usually, the construction of a persistent
B-tree requires super-linear I/Os even after sorting [38]. Below, we show that the two properties
of $\Sigma(P)$ in Lemma 2 allow building $T$ in linear I/Os.

Let us number the leaf level as level 0. In general, the parent of a level-$i$ ($i \ge 0$) node is at level
$i + 1$. We will build $T$ in a bottom-up manner, i.e., starting from the leaf level, then level 1, and
so on. We will first explain how to create the leaf nodes from $\Sigma(P)$.

We will again apply plane sweep, for which purpose we need to sort the left and right endpoints
(of the segments) in $\Sigma(P)$ together by their x-coordinates. This can be done in $O(n/B)$ I/Os as
follows. First, $P$, which is sorted by x-coordinates, essentially gives a sorted list of the left endpoints
in $\Sigma(P)$. On the other hand, our algorithm of the previous subsection generates $\Sigma(P)$ in ascending
order of the right endpoints. By merging the two lists in linear time, we obtain the desired sorted
list of left and right endpoints combined.
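The merge step can be sketched with a standard two-way merge. In the illustration below, events are hypothetical triples `(x, kind, y)`; ordering a right endpoint before a left endpoint at the same x (because the x-intervals are half-open) and dropping the infinite right endpoints are assumptions of this sketch rather than details given in the text.

```python
import heapq
import math

RIGHT, LEFT = 0, 1  # at equal x, a segment ends before a new one starts


def merge_endpoints(points_by_x, segments_by_right):
    """Merge two pre-sorted endpoint lists into one event list (x, kind, y)."""
    lefts = ((p[0], LEFT, p[1]) for p in points_by_x)
    rights = ((r, RIGHT, y) for _, r, y in segments_by_right
              if r != math.inf)  # infinite endpoints never trigger an event
    return list(heapq.merge(rights, lefts, key=lambda e: (e[0], e[1])))


P = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]  # hypothetical data
S = [(2, 3, 3), (4, 5, 1), (5, math.inf, 2), (3, math.inf, 4), (1, math.inf, 5)]
for event in merge_endpoints(P, S):
    print(event)
```

Both inputs are consumed sequentially, which is what makes the merge a linear-cost scan in the EM model.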

Before elaborating our approach of building the persistent B-tree, let us briefly review the
traditional algorithm proposed in [28]. The algorithm conceptually moves a vertical line $\ell$ from
$x = -\infty$ to $\infty$. At any moment, it maintains a B-tree $T(\ell)$ on the y-coordinates of the segments
in $\Sigma(\ell)$ (recall that $\Sigma(\ell)$ is the set of segments in $\Sigma(P)$ intersecting $\ell$). We call $T(\ell)$ the snapshot
B-tree. To do so, whenever $\ell$ hits the left (right) endpoint of a segment $s$, it inserts (deletes) the
y-coordinate of $s$ in $T(\ell)$. Deletions are logical, i.e., they simply mark the positions of $\ell$ at which
the corresponding elements are deleted, instead of physically discarding those elements. Overall,
the persistent B-tree can be regarded as a space-efficient union of all the snapshot B-trees.

The above algorithm incurs $O(n \log_B n)$ I/Os because, intuitively, (i) there are $2n$ updates in
total, and (ii) for each update, $O(\log_B n)$ I/Os are needed to locate the leaf node to be modified.
It turns out that when $\Sigma(P)$ is nesting and monotonic, the construction can be significantly
accelerated. The most crucial observation is that any update to $\Sigma(\ell)$ happens only at the bottom of $\ell$.
Specifically, whenever $\ell$ hits the left/right endpoint of a segment $s \in \Sigma(P)$, $s$ must be the lowest
segment in $\Sigma(\ell)$ (otherwise, either the nesting or monotonicity property is violated). This implies
that the leaf node of $T(\ell)$ to be altered must be the first one in $T(\ell)$ (as it contains $s$). Hence,
we can find this leaf without any I/O by buffering it in memory, in contrast to the $O(\log_B n)$ cost
originally needed.
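The bottom-only observation can be checked empirically with a small simulation. The sketch below builds the segments by brute force, replays the endpoint events, and verifies that each inserted or deleted segment is the lowest one currently intersected by the sweep line. The function name `updates_only_at_bottom` is hypothetical, and processing deletions before the insertion at the same x (in ascending y-order) is an assumption of this sketch.

```python
import math


def updates_only_at_bottom(points):
    """Replay the sweep and confirm every update touches the lowest segment."""
    segs = []
    for p in points:
        doms = [q[0] for q in points if q[0] > p[0] and q[1] > p[1]]
        segs.append((p[0], min(doms, default=math.inf), p[1]))
    # Events (x, kind, y): kind 0 = right endpoint, kind 1 = left endpoint.
    events = sorted([(r, 0, y) for _, r, y in segs if r != math.inf] +
                    [(l, 1, y) for l, _, y in segs])
    active = set()  # y-coordinates of the segments currently hit by the line
    for _, kind, y in events:
        if kind == 1:
            active.add(y)
        if y != min(active):  # the updated segment must be the lowest
            return False
        if kind == 0:
            active.remove(y)
    return True


print(updates_only_at_bottom([(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]))  # True
```

In a real construction this property is what allows the first leaf of the snapshot B-tree to be kept in memory, so no search is needed per update.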

The other details are standard. We sketch them below assuming that the reader is familiar
with the algorithm of [28]. First, since we focus on only the leaf level, it is unnecessary to maintain
any internal nodes. Whenever the first leaf $u$ of $T(\ell)$ is full, we version copy it to $u'$, and possibly
perform a split on $u'$, or merge $u'$ with its sibling (in this case, the sibling needs to be version copied
first). The sibling can be found in one I/O by keeping a pointer to it in $u$. In general, such sibling
pointers are created in node splits, and properly maintained during version copies and merges. By
the standard analysis [28], a version copy, split, and merge can all be handled in $O(1)$ I/Os, and
can happen only $O(n/B)$ times. Therefore, the cost of building the leaf level is $O(n/B)$.

Now we explain how to build the nodes of level 1. This can in fact be achieved by exactly the
same algorithm as described above, but on a different set of segments. To explain, we first review
an intuitive way [28] to visualize a node $u$ in a persistent B-tree. The node can be viewed as a
rectangle $r(u)$ in $\mathbb{R}^2$. Specifically, the x-interval of $r(u)$ starts (ends) at the position of $\ell$ at which $u$
is created (version copied). The y-interval of $r(u)$ starts (resp., ends) at the smallest y-coordinate
in $u$ (resp., in the succeeding sibling of $u$) in the snapshot B-tree where $u$ is created.

Let $\Sigma_1$ be the set of bottom edges of all $r(u)$, where $u$ ranges over all the leaf nodes of the
persistent B-tree already obtained. Note, however, that if a bottom edge is $[x, x'] \times y$, we include
$[x, x') \times y$ into $\Sigma_1$, namely, the right endpoint of the edge is not taken. $|\Sigma_1| = O(n/B)$, i.e., the
number of leaf nodes. The next lemma points out a crucial fact. To enhance presentation, we have
moved some relatively standard proofs (such as the one of the lemma below) to the appendix.

Lemma 3. $\Sigma_1$ is both nesting and monotonic.

Notice that our algorithm (for building the leaf nodes) writes the left and right endpoints of
the segments in $\Sigma_1$ in ascending order of their x-coordinates, as is due to the plane sweep. This,
together with Lemma 3, permits us to create the level-1 nodes using the same algorithm in $O(n/B^2)$
I/Os (recall that $|\Sigma_1| = O(n/B)$). We repeat the above process to construct the nodes of higher
levels. The cost decreases by a factor of $B$ every level up. The overall construction time is therefore
$O(n/B)$. We are now ready to prove our first main result:

Theorem 1. There is an indivisible linear-size structure on $n$ points in $\mathbb{R}^2$ such that a top-open
range skyline query can be answered in $O(\log_B n + k/B)$ I/Os, where $k$ is the number of reported
points. The points of the query result are reported in the order of y-coordinates. If all points have
been sorted by x-coordinates, the structure can be built in linear I/Os. The query time is optimal
even without the indivisibility assumption.

Proof. We first prove the part about report ordering. As discussed in Section 2.1, we answer a
top-open query by retrieving the segments of $\Sigma(P)$ intersecting a vertical segment. Using the persistent
B-tree, we can do so by listing those segments in ascending order of their y-coordinates [28]. This
establishes the desired output ordering because the left endpoints of those segments constitute the
result of the top-open query.

Next, we discuss the query time's optimality. First, the $k/B$ term is indispensable if $k$ points
need to be reported. The $O(\log_B n)$ term, on the other hand, is also compulsory, as can be shown by
a reduction from predecessor search. Precisely speaking, the reduction is in fact from predecessor
search to top-open range queries (note: not range skyline queries), which is well known in the
literature (see, e.g., [6]). Specifically, if a linear-size structure can answer a top-open range query
in $f(n, B) + O(k/B)$ time, the same structure also solves a predecessor query in $f(n, B)$ time.
Interestingly, given a predecessor query, the converted top-open range query always returns only 1
point. Hence, the query can also be interpreted as a top-open range skyline query, i.e., the same
reduction also works from predecessor search to top-open range skyline queries. Finally, any
linear-size structure must incur $\Omega(\log_B n)$ I/Os answering a predecessor query in the worst case [30]. It
thus follows that $\Omega(\log_B n)$ also lower bounds the cost of a top-open query.

The rest of the theorem follows from the earlier discussion directly. □ 

3 Divisible Top-open Structure 

The structure of the preceding section does not divide any coordinate, i.e., Theorem 1 holds even
under the computationally weaker external memory model with the indivisibility assumption. This
section eliminates the assumption, and unleashes the power endowed by bit manipulation. As we
will see, when the universe is small, it leads to linear-size structures with faster query time than in
Theorem 1.

3.1 Ray Dragging

We now take a short break from range skyline queries to discuss the ray dragging problem. The
input is a set $S$ of $m$ points in $[U]^2$ where $U \ge m$ is an integer. Given a vertical ray $\rho = \alpha \times [\beta, U]$
where $\alpha, \beta \in [U]$, a ray dragging query reports the first point in $S$ that is hit by $\rho$ when $\rho$ moves
towards the left. We want to store $S$ in a structure to answer all queries efficiently. We will prove:

Lemma 4. For $m = (B \log U)^{O(1)}$, we can store $S$ in a structure of size $O(m/B)$ such that a ray
dragging query can be answered in $O(1)$ I/Os.

Recall that a machine word has $\Omega(\log U)$ bits, and hence a block has $\Omega(B \log U)$ bits. The
remainder of this subsection serves as the proof of Lemma 4. We will adopt the framework of [25]
that was used to design a structure for 3-sided range queries. Nonetheless, as will be clear shortly,
several ideas specific to ray dragging are required to obtain our result.

A structure with query time $O(\log_B m)$. For this purpose, we simply store $S$ in a B-tree
that indexes the x-coordinates of the points in $S$. Let $u$ be an internal node whose child nodes are
$v_1, ..., v_B$. For each $i \in [1, B]$, we store with $u$ a point $Y_{max}(v_i)$, where in general $Y_{max}(v)$ gives the
highest point in the subtree of $v$. Define $Y^*_{max}(u) = \{Y_{max}(v_i) \mid 1 \le i \le B\}$. Furthermore, for a
leaf node $z$, define $Y^*_{max}(z)$ to be the set of points stored in $z$.

We answer a ray-dragging query with ray $\rho = \alpha \times [\beta, U]$ as follows. First, descend a root-to-leaf
path $\pi$ to the leaf node containing the predecessor of $\alpha$ among the x-coordinates (of the points) in
$S$. Let $u$ be the lowest node on $\pi$ such that $Y^*_{max}(u)$ has a point that can be hit by $\rho$ when $\rho$ moves
left. Note that whether $Y^*_{max}(u)$ includes such a point can be checked in $O(1)$ I/Os by loading
$Y^*_{max}(u)$ into memory. Hence, $u$ can be identified in $O(h)$ I/Os where $h$ is the height of the B-tree.
If $u$ does not exist, we return an empty result (i.e., $\rho$ does not hit any point no matter how far it
moves).

For the case where $u$ exists, let $p$ be the first point in $Y^*_{max}(u)$ hit by the left-moving $\rho$. Suppose
that $p$ is in the subtree of $v$, where $v$ is a child node of $u$. The query result must be in the subtree
of $v$, although it may not necessarily be $p$. To find out, we descend another path from $v$ to a
leaf. Specifically, we reset $u$ to $v$, and find the first point $p$ in $Y^*_{max}(u)$ ($= Y^*_{max}(v)$) that is hit by
the left-moving $\rho$ (notice that $p$ has changed). Now, letting $v$ be the child node of $u$ whose
subtree contains $p$, we repeat the above steps. This continues until $u$ becomes a leaf, in which case
the algorithm returns $p$ as the final answer.
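
The two descents described above can be sketched in Python as follows. This is only a minimal in-memory illustration under stated assumptions, not the external-memory structure itself: the names `Node`, `build`, `ymax_star`, `first_hit` and `ray_drag` are hypothetical, the tree is built bottom-up from a non-empty point set with a tiny fanout chosen for readability, and all coordinates are assumed to be distinct.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    points: list = field(default_factory=list)    # leaf only: the stored points (x, y)
    children: list = field(default_factory=list)  # internal only: children, left to right
    min_x: int = 0                                # smallest x-coordinate in the subtree
    ymax: tuple = None                            # Y_max(v): highest point in the subtree

def by_y(p):
    return p[1]

def build(points, leaf_cap=2, fanout=2):
    """Bottom-up construction over the points sorted by x (illustrative only)."""
    pts = sorted(points)
    level = [Node(points=pts[i:i + leaf_cap], min_x=pts[i][0],
                  ymax=max(pts[i:i + leaf_cap], key=by_y))
             for i in range(0, len(pts), leaf_cap)]
    while len(level) > 1:
        level = [Node(children=level[i:i + fanout], min_x=level[i].min_x,
                      ymax=max((c.ymax for c in level[i:i + fanout]), key=by_y))
                 for i in range(0, len(level), fanout)]
    return level[0]

def ymax_star(u):
    """Y*_max(u): the children's highest points, or the stored points at a leaf."""
    return [c.ymax for c in u.children] if u.children else u.points

def first_hit(u, alpha, beta):
    """First point of Y*_max(u) hit by the ray alpha x [beta, U] moving left, or None."""
    hits = [p for p in ymax_star(u) if p[0] <= alpha and p[1] >= beta]
    return max(hits, default=None)  # tuples compare by x first: the rightmost hit

def ray_drag(root, alpha, beta):
    # First descent: root-to-leaf path pi towards the predecessor of alpha.
    path = [root]
    while path[-1].children:
        kids = path[-1].children
        left = [c for c in kids if c.min_x <= alpha]
        path.append(left[-1] if left else kids[0])
    # Lowest node on pi whose Y*_max contains a point that the ray can hit.
    u = next((v for v in reversed(path) if first_hit(v, alpha, beta)), None)
    if u is None:
        return None  # the ray hits nothing, however far it moves
    # Second descent: follow the child whose subtree holds the current first hit.
    while u.children:
        p = first_hit(u, alpha, beta)
        u = next(c for c in u.children if c.ymax == p)
    return first_hit(u, alpha, beta)
```

Each loop iteration inspects one node, mirroring the $O(h)$ bound; in the actual structure every inspection corresponds to loading one $Y^*_{max}$ set from disk.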

It is easy to see that the query time is $O(h) = O(\log_B m)$. We will refer to the above structure
as a ray-drag B-tree. Note that if $B \ge \sqrt{\log U}$, then $O(\log_B m) = O(\log_B (B \log U)) = O(1)$, which
fulfills the purpose of Lemma 4. We therefore focus on $B < \sqrt{\log U}$ in the sequel.

Minute structure. Set $b = B \log U$. As $B < \sqrt{\log U}$, $b < \log^{3/2} U$. We now consider the case
where $S$ has very few points, or specifically, $m \le \sqrt{b} < \log^{3/4} U$.

We convert $S$ into a set $S'$ of points in an $[m]^2$ grid. For this purpose, we map a point $p \in S$
to $p' \in S'$ such that $p'_x$ ($p'_y$) is the number of points in $S$ whose x- (y-) coordinates are at most $p_x$
($p_y$). That is, $p'_x$ ($p'_y$) is the rank of $p_x$ ($p_y$) among the x- (y-) coordinates in $S$.

Given a ray $\rho = \alpha \times [\beta, \infty)$, we instead answer a query in $[m]^2$ using a ray $\rho' = \alpha' \times [\beta', \infty)$
obtained from $\rho$. Specifically, $\alpha'$ ($\beta'$) is the rank of the predecessor of $\alpha$ ($\beta$) among the x- (y-)
coordinates in $S$. We create a fusion tree [14] on the x- (y-) coordinates in $S$ so that the predecessor
of $\alpha$ ($\beta$) can be found in $O(\log_b m) = O(1)$ I/Os (see also [25]), which is thus also the cost of turning
$\rho$ into $\rho'$. The fusion tree uses $O(m/B)$ blocks.

We will ensure that the query with $\rho'$ (in $[m]^2$) returns an id from 1 to $m$ that uniquely identifies
a point $p$ in $S$, if the result is non-empty. To convert the id into the coordinates of $p$, we store the
points of $S$ in an array such that any point can be retrieved in one I/O by its id. Clearly, the array
occupies $O(m/B)$ blocks.

Next, we explain how to store $S'$. The benefit of working with $S'$ is that each coordinate in
$[m]^2$ requires fewer bits to represent than one in $[U]^2$, i.e., $\log_2 m$ bits as opposed to $\log_2 U$. In
particular, we need $3 \log_2 m$ bits in total to represent a point's x-, y-coordinates, and id. Since
$|S'| = m$, the storage of the entire $S'$ demands $3m \log m = O(\log^{3/4} U \cdot \log\log U) = o(\log U)$ bits.
In other words, we can store the entire $S'$ in a single word! Given a query with $\rho'$, we simply load
this word into memory, and answer the query in memory with no more I/O.
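
To make the bit-packing idea concrete, here is a minimal Python sketch, assuming distinct coordinates and a non-empty $S$. The helper names (`to_rank_space`, `pack`, `query_packed`, `ray_drag_minute`) are hypothetical, a Python integer stands in for the machine word, and binary search via `bisect` stands in for the fusion trees. One further assumption of the sketch: the lower end of the ray is converted with the rank of the smallest y-coordinate that is at least $\beta$, so that the converted ray covers exactly the points with y-coordinate at least $\beta$.

```python
from bisect import bisect_left, bisect_right

def to_rank_space(S):
    """Map each point of S to (rank of x, rank of y); ranks start at 1."""
    xs = sorted(x for x, _ in S)
    ys = sorted(y for _, y in S)
    ranked = [(bisect_right(xs, x), bisect_right(ys, y)) for x, y in S]
    return ranked, xs, ys

def pack(ranked):
    """Pack (x', y', id) triples, each field using `bits` bits, into one integer."""
    bits = len(ranked).bit_length()          # enough for every value in 1..m
    word = 0
    for i, (x, y) in enumerate(ranked):
        triple = (x << 2 * bits) | (y << bits) | (i + 1)   # id = i + 1
        word |= triple << (3 * bits * i)
    return word, bits

def query_packed(word, bits, m, a_rank, b_rank):
    """Return the id of the first point hit by the left-moving ray, or None."""
    mask = (1 << bits) - 1
    best_x, best_id = 0, None
    for i in range(m):
        triple = word >> (3 * bits * i)
        x, y, ident = (triple >> 2 * bits) & mask, (triple >> bits) & mask, triple & mask
        if x <= a_rank and y >= b_rank and x > best_x:
            best_x, best_id = x, ident
    return best_id

def ray_drag_minute(S, alpha, beta):
    # A real structure would precompute the ranks and the word once; they are
    # recomputed per query here only to keep the sketch short.
    ranked, xs, ys = to_rank_space(S)
    word, bits = pack(ranked)
    a_rank = bisect_right(xs, alpha)        # rank of the predecessor of alpha
    b_rank = bisect_left(ys, beta) + 1      # rank of the smallest y >= beta (assumption)
    ident = query_packed(word, bits, len(S), a_rank, b_rank)
    return None if ident is None else S[ident - 1]   # the id indexes the array of S
```

The packed integer occupies $3m$ fields of about $\log_2 m$ bits each, which is the quantity bounded by $o(\log U)$ in the text.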

From the above description, we have obtained a structure of $O(m/B)$ blocks with constant
query time when $m \le \sqrt{b}$.

Proving Lemma 4. We are ready to discuss the case where $m = b^{O(1)}$, as needed in Lemma 4.
In this case, we create a ray-drag B-tree on $S$ as described earlier, but setting its internal fanout
to $b$ (the leaf capacity is still $B$). The height of the tree is therefore $h = O(\log_b m) = O(1)$. Recall
that each internal node $u$ of the ray-drag B-tree is associated with a set $Y^*_{max}(u)$ of size $b$. We can
no longer store $Y^*_{max}(u)$ in a single block because $b$ may be greater than $B$. Instead, we create a
minute structure on $Y^*_{max}(u)$, which needs $O(1 + b/B)$ space. Since there are $O(\min\{n/B, n/(bB)\})$
internal nodes, all the minute structures occupy $O(n/B)$ blocks in total.

A query with ray $\rho$ can still be answered by the same algorithm of the ray-drag B-tree discussed
before. The only difference is that, at an internal node $u$, we search the minute structure on $Y^*_{max}(u)$
to find the first point $p$ hit by the left-moving $\rho$. As this requires only $O(1)$ I/Os, the total query
cost is $O(h) = O(1)$. There is a technical detail that requires a bit of clarification. Recall that,
regarding $p$, we need to know which child node of $u$ contains $p$ in the subtree. This can be achieved
by associating $p$ with the child node's index in $u$. The index requires only $\log_2 b \le \log_2 U$ bits,
which is no more than the length of a coordinate. Hence, we can easily store the index along with
$p$ in the minute structure on $Y^*_{max}(u)$. We thus complete the proof of Lemma 4.

3.2 Top-open Structure on Few Points 

We now resume our study of top-open queries. Remember that the input is a set $P$ of $n$ points
in $[U]^2$ for some integer $U \ge n$, and a query is given a rectangle $Q = [\alpha_1, \alpha_2] \times [\beta, U]$ where
$\alpha_1, \alpha_2, \beta \in [U]$. We will present a structure for small $P$, specifically:

Lemma 5. For $n \le (B \log U)^{O(1)}$, we can store $P$ in a linear-size structure that answers a top-open
range skyline query in $O(1 + k/B)$ I/Os, where $k$ is the number of reported points. Furthermore,
the points are reported in the order of y-coordinates.

Proof. Consider a query with $Q = [\alpha_1, \alpha_2] \times [\beta, U]$. Let $p$ be the first point hit by the ray
$\rho = \alpha_2 \times [\beta, U]$ when $\rho$ moves left. If $p$ does not exist or is out of $Q$ (i.e., $p_x < \alpha_1$), the top-open query
has an empty result. Otherwise (i.e., $p \in Q$), $p$ must be the lowest point in the skyline of $P \cap Q$.
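
The claim that the ray-dragging answer is the lowest skyline point can be checked against a brute-force reference. The Python sketch below is purely illustrative: the function names are hypothetical, every routine scans all points (so it says nothing about I/O cost), and coordinates are assumed to be distinct.

```python
def skyline(points):
    """Points not dominated by any other point, in ascending order of y."""
    return sorted((p for p in points
                   if not any(q[0] > p[0] and q[1] > p[1] for q in points)),
                  key=lambda p: p[1])

def top_open_skyline(P, a1, a2, beta):
    """Brute-force top-open range skyline for Q = [a1, a2] x [beta, infinity)."""
    return skyline([p for p in P if a1 <= p[0] <= a2 and p[1] >= beta])

def ray_drag(P, alpha, beta):
    """Brute-force ray dragging: rightmost point with x <= alpha and y >= beta."""
    return max((p for p in P if p[0] <= alpha and p[1] >= beta), default=None)

def lowest_skyline_point(P, a1, a2, beta):
    """Lowest skyline point of P within Q, obtained by dragging the ray a2 x [beta, U]."""
    p = ray_drag(P, a2, beta)
    return None if p is None or p[0] < a1 else p
```

The reason the two agree: the rightmost point of $P \cap Q$ cannot be dominated inside $Q$, and every other skyline point lies to its left and must therefore be higher.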

The subsequent discussion focuses on the scenario where $p \in Q$. We index $\Sigma(P)$ with a persistent
B-tree $T$, as in Theorem 1. Recall that the top-open query can be solved by retrieving the set $S$
of segments in $\Sigma(P)$ intersecting the vertical segment $\psi = \alpha_2 \times [\beta, \beta']$, where $\beta'$ is the highest
y-coordinate of the points in $P \cap Q$. To do so in $O(1 + k/B)$ I/Os, we need some observations:

1. $S$ includes exactly the segments of $\Sigma(P)$ intersecting the vertical segment $\psi' = p_x \times [p_y, \beta']$.

We prove this in two steps. First, notice that $\sigma(p)$ is the lowest among the segments of
$\Sigma(P)$ intersecting $\psi$ (recall that $\sigma(p)$ is the segment in $\Sigma(P)$ converted from $p$). Hence, a
segment of $\Sigma(P)$ intersects $\psi$ if and only if it intersects $\alpha_2 \times [p_y, \beta']$. Second, a segment of
$\Sigma(P)$ intersects $\alpha_2 \times [p_y, \beta']$ if and only if it intersects $\psi'$. This is because of the nesting and
monotonicity properties of $\Sigma(P)$. Specifically, let $s \ne \sigma(p)$ be a segment in $\Sigma(P)$ intersecting
$\alpha_2 \times [p_y, \beta']$. As $s$ is higher than $\sigma(p)$, the x-interval of $s$ must contain that of $\sigma(p)$, implying
that $s$ intersects $\psi'$. In the same manner, one can show that if $s$ intersects $\psi'$, it also intersects
$\alpha_2 \times [p_y, \beta']$.

2. Let $T(\ell)$ be the snapshot B-tree in $T$ when $\ell$ is at the position $x = p_x$. Once we have obtained
the leaf node in $T(\ell)$ containing $y_p$, we can retrieve $S$ in $O(1 + k/B)$ I/Os without knowing
the value of $\beta'$.

Each leaf node in $T(\ell)$ has a sibling pointer to its succeeding leaf node. Hence, starting from
the leaf storing $y_p$, we can visit the leaves of $T(\ell)$ in ascending order of the y-coordinates they
contain. The effect is to report in bottom-up order the segments of $\Sigma(P)$ that intersect
the ray $p_x \times [p_y, U]$. By nesting and monotonicity, the left endpoint of a segment reported
later has a smaller x-coordinate. We stop as soon as we reach a segment whose left endpoint
falls out of $Q$. The cost is $O(1 + k/B)$ because $\Omega(B)$ segments are reported in each accessed
leaf, except possibly the last one.

We now elaborate the structure of Lemma 5. Besides $T$, also create a structure of Lemma 4
on $P$. Moreover, for every point $p \in P$, keep a pointer to the leaf node that (i) is in the snapshot
B-tree $T(\ell)$ when $\ell$ is at $x = p_x$, and (ii) contains $y_p$. Store the pointers in an array of size $n$ to
permit retrieving the pointer of any point in one I/O. The query algorithm is straightforward from
the previous analysis, and performs $O(1 + k/B)$ I/Os. □

We will refer to the structure of Lemma 5 as a few-point structure.

3.3 Final Top-Open Structure

This subsection proposes our top-open structure that works for arbitrary $n$. For simplicity, we will
discuss first the rank space, i.e., the universe is $[U]^2$ where $U = O(n)$. Our results will be extended
to general $U$ at the end of the section.

Structure. A part of our solution externalizes a pointer-machine structure in [7]. That structure,
however, has logarithmic query time. Hence, extra ideas are needed to eliminate the logarithmic
factor. Below we show how to achieve this with Lemma 5.

We assume, without loss of generality, that both $\lambda = B \log^2 U$ and $U/\lambda$ are integers. Divide
the x-dimension of $[U]^2$ into $U/\lambda$ consecutive intervals of length $\lambda$ each. Call each interval a chunk.
Assign each point $p \in P$ to the unique chunk covering $x_p$. Note that some chunks may be empty.

Create a complete binary search tree $\mathcal{T}$ on the chunks. Let $u$ be a node of $\mathcal{T}$. Denote by $P(u)$
the set of points (assigned to the chunks) in the subtree of $u$. Define $\mathit{high}(u)$ as the set of $B$ highest
points in the skyline of $P(u)$. If the skyline of $P(u)$ has fewer than $B$ points, $\mathit{high}(u)$ includes all
of them. Furthermore, if $|\mathit{high}(u)| = B$, let $\mathit{highend}(u)$ be the lowest point in $\mathit{high}(u)$; otherwise,
$\mathit{highend}(u) = \mathrm{NULL}$. We store $\mathit{high}(u)$ along with $u$. Also, if $\mathit{highend}(u) \ne \mathrm{NULL}$, we record in $u$
a pointer to the chunk covering the x-coordinate of $\mathit{highend}(u)$.$^2$

$^2$ The pointer is not needed in the rank space. Later, the same structure as described here will be deployed in a
universe $[U]^2$ of general $U$, where the pointer will be useful.
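
As a small illustration of the definitions of $\mathit{high}(u)$ and $\mathit{highend}(u)$, the following Python sketch computes both from the point set $P(u)$ of a node. The function names are hypothetical, points are `(x, y)` tuples with distinct coordinates, and the computation is in memory only.

```python
def skyline_desc_y(points):
    """Skyline of `points`, listed from the highest to the lowest point."""
    out, best_x = [], float("-inf")
    for p in sorted(points, key=lambda q: -q[1]):
        # p is on the skyline iff no higher point lies to its right.
        if p[0] > best_x:
            out.append(p)
            best_x = p[0]
    return out

def high_and_highend(points, B):
    """Return (high(u), highend(u)) for a node u with P(u) = points."""
    high = skyline_desc_y(points)[:B]             # the B highest skyline points
    highend = high[-1] if len(high) == B else None
    return high, highend
```

For example, with `B = 2` and the points `(1,9), (2,4), (3,7), (5,5), (6,1), (4,2)`, the skyline has four points, so `high` keeps the top two and `highend` is the lower of them.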

Consider an internal node $u$ such that $p = \mathit{highend}(u)$ is not NULL. In this case, let $\pi(u)$ be
the path from the leaf (a.k.a. chunk) $z$ of $\mathcal{T}$ covering $x_p$ to the child of $u$ that is an ancestor of
$z$. Define $\Pi_r(u)$ as the set of right siblings of the nodes in $\pi(u)$ (note: if a node is the right child
of its parent, it has no right sibling; similarly, if a node is a left child, it has no left sibling). Let
$\mathit{MAX}(u)$ be the skyline of the point set

$$\bigcup_{v \in \Pi_r(u)} \mathit{high}(v).$$

We store $\mathit{MAX}(u)$ along with $u$, where the points are ordered by x-coordinates (hence, also by
y-coordinates).

The above design completes the externalization of the structure in [7]. Next, we describe a new
mechanism for obtaining $O(1 + k/B)$ query time. First, we index the points in each chunk $z$ with
a few-point structure of Lemma 5. Moreover, for every proper ancestor $u$ of $z$ in $\mathcal{T}$, we store two
sets $\mathit{LMAX}(z, u)$ and $\mathit{RMAX}(z, u)$ defined as follows. Abusing notation slightly, let $\pi(z, u)$ be the
path from $z$ to the child of $u$ that is an ancestor of $z$. Also, define $\Pi_\ell(z, u)$ as the set of left siblings
of the nodes on $\pi(z, u)$, and conversely, $\Pi_r(z, u)$ the set of right siblings of those nodes. Then,
$\mathit{LMAX}(z, u)$ is the skyline of the point set:

$$\bigcup_{v \in \Pi_\ell(z, u)} \mathit{high}(v),$$

whereas $\mathit{RMAX}(z, u)$ is defined symmetrically by replacing $\Pi_\ell(z, u)$ with $\Pi_r(z, u)$. The points of
both $\mathit{LMAX}(z, u)$ and $\mathit{RMAX}(z, u)$ are sorted by x-coordinates.

Space. Let $h = O(\log U)$ be the height of $\mathcal{T}$. We analyze first the space on the $O(U/\lambda)$ internal
nodes $u$ of $\mathcal{T}$. First, $\mathit{high}(u)$ fits in $O(1)$ blocks. Second, as $\mathit{MAX}(u)$ has $O(hB)$ points, it occupies
$O(h)$ blocks. All the internal nodes thus occupy $O(h \cdot (U/\lambda)) = O(U/B) = O(n/B)$ blocks in total.

Now, let us focus on the $O(U/\lambda)$ leaf nodes $z$ of $\mathcal{T}$. As each few-point structure uses linear
space, all the few-point structures demand $O(U/\lambda + n/B) = O(n/B)$ blocks altogether. Regarding
$\mathit{LMAX}(z, u)$, $z$ has at most $h$ proper ancestors $u$, while each $\mathit{LMAX}(z, u)$ requires $O(h)$ blocks.
Hence, the $\mathit{LMAX}(z, u)$ of all $z$ and $u$ occupy $O((U/\lambda) \cdot h^2) = O(n/B)$ blocks in total. The case
with $\mathit{RMAX}(z, u)$ is symmetric. The overall space consumption is therefore linear.
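
For reference, the arithmetic behind the two dominant terms can be spelled out as follows, using only the quantities already fixed in the text: $h = O(\log U)$, $\lambda = B \log^2 U$, and $U = O(n)$ in the rank space.

```latex
\begin{align*}
\text{internal nodes:}\quad
  O\!\left(h \cdot \frac{U}{\lambda}\right)
  &= O\!\left(\frac{U \log U}{B \log^2 U}\right)
   = O\!\left(\frac{U}{B \log U}\right)
   = O\!\left(\frac{n}{B}\right), \\
\mathit{LMAX} \text{ and } \mathit{RMAX}:\quad
  O\!\left(\frac{U}{\lambda} \cdot h^2\right)
  &= O\!\left(\frac{U \log^2 U}{B \log^2 U}\right)
   = O\!\left(\frac{U}{B}\right)
   = O\!\left(\frac{n}{B}\right).
\end{align*}
```

The second line shows why the chunk length carries a $\log^2 U$ factor: it absorbs exactly the $h^2$ blow-up of the per-leaf sets.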

Query. We will need the following fact: 

Lemma 6. Given a node $u$ in $\mathcal{T}$ and a value $\beta$, we can report the $k$ points in the skyline of $P(u, \beta)$
in $O(1 + k/B)$ I/Os, where $P(u, \beta)$ is the set of points in $P(u)$ with y-coordinates greater than $\beta$.
The points are reported in the order of y-coordinates.

Consider a top-open query with $Q = [\alpha_1, \alpha_2] \times [\beta, U]$, where $\alpha_1, \alpha_2, \beta \in [U]$. To answer it,
we first identify the chunks $z_1$ and $z_2$ that cover $\alpha_1$ and $\alpha_2$, respectively. This takes $O(1)$ time
by dividing $\alpha_1$ and $\alpha_2$ by the chunk size $\lambda$, respectively. If $z_1 = z_2$, the query can be solved by
searching the few-point structure of $z_1$ in $O(1 + k/B)$ I/Os (Lemma 5). Next, we focus on $z_1 \ne z_2$.

Let $u$ be the lowest common ancestor of $z_1$ and $z_2$ in $\mathcal{T}$. As $\mathcal{T}$ is a complete binary tree, $u$ can
be determined in constant time. The rest of the algorithm proceeds in 4 steps:
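
One common way to realize these two constant-time computations is sketched below in Python. It is an illustration under assumptions that the text does not state: coordinates are taken to be 0-based, the number of chunks is a power of two, and the tree nodes are numbered as in an array-based heap (root 1, children `2i` and `2i + 1`). The function names are hypothetical.

```python
def chunk_of(x, lam):
    """Index of the chunk covering coordinate x, for chunks of length lam."""
    return x // lam

def lca_heap(i, j, num_leaves):
    """Lowest common ancestor of leaves i and j in heap numbering.

    Leaf k is node num_leaves + k. The two root-to-leaf paths agree exactly
    on the common binary prefix of the two node numbers.
    """
    a, b = num_leaves + i, num_leaves + j
    shift = (a ^ b).bit_length()   # number of low-order bits in which a and b differ
    return a >> shift
```

With 8 chunks, leaves 2 and 3 are nodes 10 and 11, whose common ancestor is node 5; leaves 0 and 7 meet only at the root, node 1.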

Step 1. Use the few-point structure of $z_2$ to report the skyline of $P(z_2) \cap Q$. Let $S(z_2)$ be the
set of points retrieved, and $\beta^*$ the maximum y-coordinate of the points in $S(z_2)$. If $S(z_2) = \emptyset$,
$\beta^* = \beta$.

Step 2. Report the set $S_2$ of points in $\mathit{LMAX}(z_2, u)$ whose y-coordinates are above $\beta^*$. Denote
by $v_1, ..., v_c$ the nodes of $\Pi_\ell(z_2, u)$ in descending levels for some integer $c$. For each $i \in [1, c]$, check
whether $\mathit{high}(v_i) \cap S_2$ has $B$ points. If not, the subtree of $v_i$ can be eliminated. Otherwise, apply
Lemma 6 to retrieve the skyline of $P(v_i, \beta_i)$, where $\beta_i$ is the maximum y-coordinate of the points
in $S_2$ that are lower than $\mathit{highend}(v_i)$. If no such point exists, $\beta_i = \beta^*$. If $S_2 \ne \emptyset$, update $\beta^*$ to be
the y-coordinate of the highest point in $S_2$.

Step 3. Find the set $S_1$ of points in $\mathit{RMAX}(z_1, u)$ whose y-coordinates exceed $\beta^*$. Denote by
$v'_1, ..., v'_{c'}$ the nodes of $\Pi_r(z_1, u)$ in descending levels for some integer $c'$. For each $i \in [1, c']$, if
$|\mathit{high}(v'_i) \cap S_1| = B$, apply Lemma 6 to retrieve the skyline of $P(v'_i, \beta'_i)$, where $\beta'_i$ is the maximum
y-coordinate of the points in $S_1$ that are lower than $\mathit{highend}(v'_i)$ (if no such point, $\beta'_i = \beta^*$). If
$S_1 \ne \emptyset$, set $\beta^*$ to the y-coordinate of the highest point in $S_1$.

Step 4. Fetch the skyline of $P(z_1) \cap ([\alpha_1, \alpha_2] \times [\beta^*, U])$ from the few-point structure of $z_1$.

To analyze the cost, we focus on the first two steps because the other steps are symmetric. By
Lemma 5, Step 1 takes $O(1 + k'/B)$ I/Os, where $k'$ is the number of points reported in this step. In
Step 2, by leveraging the ordering inside $\mathit{LMAX}(z_2, u)$, $S_2$ can be found in $O(1 + |S_2|/B)$ I/Os. We
charge the second term on the points of $S_2$. Note that the points in $S_2$ are sorted by y-coordinates,
thanks to the ordering in $\mathit{LMAX}(z_2, u)$. For each $i \in [1, c]$, if $\mathit{high}(v_i) \cap S_2$ has fewer than $B$ points$^3$,
the subtree of $v_i$ incurs no more cost. Otherwise, the application of Lemma 6 takes $O(k'_i/B)$ I/Os
if the application finds $k'_i$ points (note that $k'_i \ge B$ since the whole $\mathit{high}(v_i)$ is definitely reported).
We charge the cost on those $k'_i$ points. Overall, every reported point is charged $O(1/B)$ I/Os. Steps
1-4 each necessitate $O(1)$ extra I/Os. The total query cost is therefore $O(1 + k/B)$.

As in the proof of Lemma 6, it is easy to modify the above algorithm to output the points in
the order of y-coordinates. We thus have arrived at:

Theorem 2. There is a linear-size structure on $n$ points in the rank space such that a top-open
range skyline query can be answered optimally in $O(1 + k/B)$ I/Os, where $k$ is the number of
reported points. The points of the query result are reported in the order of y-coordinates.

General U . The above solution is for the rank space. For general U, we can extend our solution 
to obtain: 

Corollary 1. There is a linear-size structure on a set of $n$ points in $[U]^2$ (where $U \ge n$ is an
integer) such that a top-open range skyline query can be answered in $O(\log\log_B U + k/B)$ I/Os,
where $k$ is the number of reported points. The points of the query result are reported in the order
of y-coordinates. The query cost is optimal even without the indivisibility assumption.

Proof. We divide the points of $P$ into $n/\lambda$ disjoint subsets by their x-coordinates, where each subset
has exactly $\lambda$ points, except possibly the last subset. Refer to each subset as a chunk. As before,
build a complete binary search tree $\mathcal{T}$ on the set of chunks. All the secondary structures are exactly
the same as described previously.

To answer a top-open query with $Q = [\alpha_1, \alpha_2] \times [\beta, U]$, we can also apply the same algorithm as
presented earlier. The only difference is that the leaf nodes $z_1$ and $z_2$ of $\mathcal{T}$ can no longer be obtained
in constant time. Instead, we find them by looking for the predecessors of $\alpha_1$ and $\alpha_2$, respectively,
among the starting x-coordinates of all the chunks. It is well known that a predecessor query on $n$
values in $[U]$ can be answered in $O(\log\log_B U)$ I/Os using a structure of $O(n/B)$ blocks.
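
The lookup of a chunk by predecessor search can be illustrated with the short Python sketch below. Note the substitution: it uses plain binary search from the standard `bisect` module, which takes logarithmic time, as a stand-in for the $O(\log\log_B U)$ predecessor structure referred to above. The function name `find_chunk` and the sample starting coordinates are hypothetical.

```python
from bisect import bisect_right

def find_chunk(chunk_starts, alpha):
    """Index of the chunk whose starting x-coordinate is the predecessor of alpha.

    chunk_starts must be sorted in ascending order. Returns None when alpha
    precedes the first chunk, i.e., when alpha has no predecessor.
    """
    i = bisect_right(chunk_starts, alpha) - 1
    return i if i >= 0 else None

# Hypothetical starting x-coordinates of four chunks.
starts = [3, 10, 42, 77]
z1, z2 = find_chunk(starts, 12), find_chunk(starts, 80)
```

Only this lookup changes with respect to the rank-space algorithm; everything after $z_1$ and $z_2$ are known proceeds as before.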

$^3$ This can be checked efficiently because the points of $\mathit{high}(v_i)$ (if any) are consecutive in $S_2$.

The optimality of the query time follows directly from the reduction explained in the proof of
Theorem 1 and the $\Omega(\log\log_B U)$ lower bound of predecessor search under the linear space budget
[30]. □



4 General Range Skyline Queries 

This section will move away from top-open queries. It would be nice if there was a linear-size
structure that answered a bottom-open query in $O(\log_B n + k/B)$ I/Os. Unfortunately, this is
impossible. In fact, this is impossible even for anti-dominance queries, which turn out to be as
hard as general (4-sided) range skyline queries when linear space is compulsory. Next, we will
formally establish these facts.

A lower bound. We will first present a hardness result for anti-dominance queries. As mentioned
in Section 1.2, some progress has been made in the internal-memory pointer machine.
Kejlberg-Rasmussen et al. [23] proved:

Lemma 7 ([23]). For any integer $d \ge 1$ and $\lambda \ge 1$, there is a set $P$ of $d^\lambda$ points in $\mathbb{R}^2$ and a set
$G$ of $\lambda d^{\lambda - 1}$ anti-dominance queries such that (i) each query in $G$ retrieves $d$ points of $P$, and (ii)
at most one point in $P$ is returned by two queries in $G$ simultaneously.

We use the term $(d, \lambda)$-input to refer to the set $P$ obtained in the above lemma after $d$ and $\lambda$
have been fixed. We deploy such input sets to derive:

Theorem 3. Regarding anti-dominance queries on $n$ points, any indivisible structure of at most
$cn/B$ blocks must incur $\Omega((n/B)^{1/(25c)} + k/B)$ I/Os answering a query in the worst case, where
$c \ge 1$ is a constant and $k$ is the result size.

Proof. Let us first review the indexability theorem of [17]. Let $\mathcal{A}$ be a structure on a $(d, \lambda)$-input.
Define the access overhead of $\mathcal{A}$ as the smallest value $A$ that allows us to claim: $\mathcal{A}$ answers any
query with output size $d$ in $Ad/B$ I/Os. In the context of Lemma 7, the indexability theorem
states:

if $d \ge B/2$ and $A \le \sqrt{B}/4$, $\mathcal{A}$ must use at least $\frac{\lambda}{12} \cdot \frac{d^\lambda}{B}$ blocks.

Next, we will argue that if a structure has query complexity $O((n/B)^{1/(25c)} + k/B)$, it must use
strictly more than $cn/B$ blocks in the worst case. This implies that no structure of at most $cn/B$
blocks can guarantee the aforementioned query time, hence proving Theorem 3.

Consider any structure with query time $O((n/B)^{1/(25c)} + k/B)$. Let $\mathcal{A}$ be the structure's instance
on a $(d, \lambda)$-input where $d = B$ and $\lambda = 12c + 1.1$. The I/O cost of $\mathcal{A}$ answering a query with output
size $k = d$ is at most

$$\alpha \left( (d^\lambda / B)^{1/(25c)} + d/B \right) = \alpha B^{\frac{12c + 0.1}{25c}} + \alpha = \alpha B^{\frac{12}{25} + \frac{0.1}{25c}} + \alpha \le \alpha B^{\frac{12.1}{25}} + \alpha$$

where $\alpha > 0$ is a certain constant. It thus follows that $A \le \alpha B^{12.1/25} + \alpha < \sqrt{B}/4$ when $B$ is
sufficiently large. Therefore, by the indexability theorem, the structure must occupy at least
$(\lambda/12) d^\lambda / B = (c + 1.1/12) n/B > cn/B$ blocks. □

We thus have the following cleaner result: 

Corollary 2. Regarding anti-dominance queries on $n$ points, any indivisible linear-size structure
must incur $\Omega((n/B)^\varepsilon + k/B)$ I/Os answering a query in the worst case, where $\varepsilon > 0$ is a constant
and $k$ is the result size.

Note that the same bound obviously holds for any generalization of anti-dominance queries, for 
instance, 4-sided range skyline queries. 

Optimal structure. We now give an indivisible linear-size structure on a set $P$ of $n$ points in $\mathbb{R}^2$
such that a 4-sided range skyline query can be answered in $O((n/B)^\varepsilon + k/B)$ I/Os, where $\varepsilon > 0$
can be any constant. The query performance is optimal according to Corollary 2.

Store $P$ in a B-tree $T$ that indexes the points' x-coordinates. $T$ has leaf capacity $B$ and internal
fanout $f = (n/B)^\varepsilon / \log_B n$. The height $h$ of $T$ is thus $O(\log_f (n/B)) = O(1)$. For each node $u$
in $T$, let $P(u)$ be the set of points in the subtree of $u$. We manage $P(u)$ using a structure $R(u)$
for answering right-open queries. By the symmetry of right- and top-open queries, $R(u)$ can be
implemented as a structure of Theorem 1. The right-open structures of all nodes at the same level
of $T$ consume $O(n/B)$ space in total. As $T$ has only a constant number of levels, the total space cost is $O(n/B)$.
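
The claim that the height is constant can be verified with a short calculation. The derivation below is a sketch that assumes $n/B$ is large enough for the stated inequality to hold; it uses only the fanout $f = (n/B)^\varepsilon / \log_B n$ defined above.

```latex
\begin{align*}
\log_f \frac{n}{B}
  &= \frac{\ln (n/B)}{\ln f}
   = \frac{\ln (n/B)}{\varepsilon \ln (n/B) - \ln \log_B n}. \\
\intertext{Since $\ln \log_B n = o(\ln (n/B))$, we have
  $\ln \log_B n \le \tfrac{\varepsilon}{2} \ln (n/B)$ for sufficiently large $n/B$, and hence}
\log_f \frac{n}{B}
  &\le \frac{\ln (n/B)}{(\varepsilon/2) \ln (n/B)}
   = \frac{2}{\varepsilon} = O(1).
\end{align*}
```

The division by $\log_B n$ in the fanout is also what gives the identity $f \cdot \log_B n = (n/B)^\varepsilon$.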

Given a 4-sided query with search rectangle $Q = [\alpha_1, \alpha_2] \times [\beta_1, \beta_2]$, we find in $O(h(f/B)) =
O((n/B)^\varepsilon)$ I/Os the leaf nodes $z_1$ and $z_2$ of $T$ containing the successor of $\alpha_1$ and the predecessor
of $\alpha_2$, respectively, among the x-coordinates indexed by $T$. If $z_1 = z_2$, solve the query by loading the $B$ points
in $z_1$ into memory using $O(1)$ I/Os.

Consider now $z_1 \ne z_2$. Let $\pi_1$ ($\pi_2$) be the path from the lowest common ancestor of $z_1$ and $z_2$
to $z_1$ ($z_2$). Let $S$ be the set of child nodes $v$ of the internal nodes on $\pi_1 \cup \pi_2$ such that the x-interval
of $v$ is fully contained in $[\alpha_1, \alpha_2]$ (note that every node in $T$ corresponds to an x-interval tightly
enclosing the x-coordinates in its subtree). There is a natural ordering of the nodes in $S$ by their
x-intervals. We can easily obtain $S$ in this order using $O(h(f/B)) = O((n/B)^\varepsilon)$ I/Os. Note that
$|S| \le hf = O((n/B)^\varepsilon / \log_B n)$.

We will issue a 4-sided query for $z_2$, then a right-open query for each node in $S$, and finally
a 4-sided query for $z_1$. Specifically, we first find the skyline of $P(z_2) \cap Q$. This takes constant
I/Os because $z_2$ has only $B$ points. Let $\beta^*$ be the maximum y-coordinate of the points in the
aforementioned skyline. Next, we process the nodes $v$ of $S$ in the right-to-left order of their
x-intervals. For each $v$, perform a right-open query with $(-\infty, \infty) \times [\beta^*, \beta_2]$ on $R(v)$, and output all
the points retrieved. If the query returned at least one point, update $\beta^*$ to be the y-coordinate of
the highest point returned. Finally, issue a 4-sided query with $[\alpha_1, \alpha_2] \times [\beta^*, \beta_2]$ on $z_1$ in $O(1)$ I/Os.
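
The right-to-left sweep with a rising threshold $\beta^*$ can be sketched in Python as follows. This is a simplified in-memory illustration, not the structure itself: the input is a hypothetical list `slabs` of point lists ordered from left to right (playing the roles of $z_1$, the nodes of $S$, and $z_2$), every sub-query is answered by brute force rather than by $R(v)$, the x-range filter is applied to every slab for simplicity, and coordinates are assumed distinct. The sketch also starts from $\beta_1$ when no point has been found yet.

```python
def skyline(points):
    """Points not dominated by any other point, in ascending order of y."""
    return sorted((p for p in points
                   if not any(q[0] > p[0] and q[1] > p[1] for q in points)),
                  key=lambda p: p[1])

def four_sided_skyline(slabs, a1, a2, b1, b2):
    """Skyline of the points in [a1, a2] x [b1, b2], in ascending order of y."""
    result, beta_star = [], None   # beta_star: highest y reported so far
    for slab in reversed(slabs):   # process x-ranges from right to left
        cand = [p for p in slab
                if a1 <= p[0] <= a2 and b1 <= p[1] <= b2
                and (beta_star is None or p[1] > beta_star)]
        sky = skyline(cand)
        if sky:
            result.extend(sky)         # all of these lie above earlier output
            beta_star = sky[-1][1]     # raise the threshold
    return result
```

Raising the threshold is what discards, without inspection, every point of a slab that is dominated by something further to the right, and it is also why the concatenated output is already sorted by y-coordinate.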

Since each right-open query on the nodes of $S$ costs $O(\log_B n)$ I/Os (plus linear output time),
all such queries incur in total $O(|S| \log_B n + k/B) = O((n/B)^\varepsilon + k/B)$ I/Os. By the order in which we issued
those queries and the output order of each query (guaranteed by Theorem 1), it is clear that the
above algorithm can generate the result points in the order of y-coordinates in $O(k/B)$ extra I/Os.
We thus conclude:

Theorem 4. There is an indivisible linear-size structure on $n$ points in $\mathbb{R}^2$ such that a 4-sided
range skyline query can be answered in $O((n/B)^\varepsilon + k/B)$ I/Os, where $k$ is the number of reported
points. The points of the query result are reported in the order of y-coordinates. The query time is
optimal under the indivisibility assumption.

Appendix 1: Proof of Lemma 3

Our argument is by induction on the position of $\ell$. For this purpose, care must be taken to interpret
the rectangles of the nodes currently in $T(\ell)$. As these nodes have not been version copied yet, the
right edges of their rectangles lie on $\ell$. As $\ell$ moves, so do these right edges, until the corresponding
nodes are version copied. Let set $\Sigma_1(\ell)$ include the bottom edges of the rectangles of all leaf nodes
already spawned so far, counting also the nodes in $T(\ell)$. Remember that the right endpoints of those
edges are not taken, as explained earlier. When we finish building all the leaves, $\Sigma_1(\ell)$ becomes
the final $\Sigma_1$. We will show that $\Sigma_1(\ell)$ is nesting and monotonic at all times. This is obviously true
when $\ell$ is at $x = -\infty$.

Now, suppose that $\Sigma_1(\ell)$ is currently nesting and monotonic. We will prove that it remains so
after the next update on $T(\ell)$. This is trivial if the update does not cause any version copy, i.e.,
the first leaf node $u$ of $T(\ell)$ is not full yet. Consider instead that $u$ is version copied to $u'$. At
this point, $r(u)$ is finalized. Because $r(u)$ is the lowest among the rectangles of the nodes in $T(\ell)$,
its finalization cannot affect the nesting and monotonicity of $\Sigma_1(\ell)$. The version copy also creates
$r(u')$. Note that the right edge of $r(u)$ and the left edge of $r(u')$ both lie on $\ell$. Hence, the bottom
edge of $r(u)$ is disjoint with that of $r(u')$ (recall that the right endpoint of an edge is not taken).
Furthermore, $r(u')$ has the same y-interval as $r(u)$, and a zero-length x-interval (i.e., the x-interval
is a point). Therefore, if no split/merge follows, $\Sigma_1(\ell)$ is still nesting and monotonic.

Next, consider that $u'$ is split into $u'_1$ and $u'_2$. In this case, $r(u')$ disappears from $\Sigma_1(\ell)$, and is
replaced by $r(u'_1)$ and $r(u'_2)$, which are the bottom two among the rectangles of the nodes in $T(\ell)$.
Furthermore, both $r(u'_1)$ and $r(u'_2)$ have zero-length x-intervals that degenerate into a point at the
position of $\ell$. It follows that $\Sigma_1(\ell)$ is still nesting and monotonic.

It remains to discuss the case where $u'$ needs to be merged with its sibling $v$. When this happens,
the algorithm first version copies $v$ to $v'$, which finalizes $r(v)$. The x-interval of $r(v)$ must contain
that of $r(u)$, which is consistent with nesting and monotonicity because $r(v)$ is above $r(u)$. The
merge of $u'$ and $v'$ creates a node $z$, such that $r(z)$ has zero-length x-interval. Note that $r(z)$ is
currently the lowest of the rectangles of the nodes in $T(\ell)$. It is clear that $\Sigma_1(\ell)$ remains nesting
and monotonic.

Finally, z may still need to be split one more time, but this case can be analyzed in the same 
way as the split scenario mentioned earlier. We thus conclude the proof. 

Appendix 2: Proof of Lemma 6

If $u$ is a leaf, find the skyline of $P(u, \beta)$ by issuing a top-open query with search rectangle $[-U, U] \times
[\beta, U]$ on the few-point structure of $u$. The query time is $O(1 + k/B)$ by Lemma 5.

The rest of the proof is the adaptation of a similar argument in [7] to external memory. If $u$ is
internal, we find the skyline of $P(u, \beta)$ as follows. Load $\mathit{high}(u)$ into memory, and report the points
therein with y-coordinates above $\beta$. If there are fewer than $B$ such points, we have found the entire
skyline of $P(u, \beta)$, due to the definition of $\mathit{high}(u)$.

Suppose instead that the entire $\mathit{high}(u)$ is reported. Let $p = \mathit{highend}(u)$. It suffices to consider
the points that

- (i) are in the subtrees of the nodes in $\Pi_r(u)$, or

- (ii) share the same chunk as, but are to the right of, $p$.

Any other point of $P(u)$ must be either in $\mathit{high}(u)$ (which is already visited) or dominated by $p$.

To find the skyline points in (i), we collect the set S of points in MAX(u) whose y-coordinates 
are above β. The points in S need to be returned, but there can be more result points in (i). To 
extract them all, we need to explore the subtrees of certain nodes in Π_γ(u). Specifically, let v_1, ..., v_c 
be the nodes in Π_γ(u) in ascending order of level, where c is some integer. For each v_i, if S_i = high(v_i) ∩ S 
has fewer than B points, the subtree of v_i can be pruned from further consideration. Otherwise (i.e., 
the whole high(v_i) is in S), we recursively report the skyline of P(v_i, β_i), where β_i is the maximum 
y-coordinate of the points in S that are lower than highend(v_i). If no such point exists, β_i = β. 
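
The pruning test and the choice of threshold for one node can be sketched as follows. This is a direct, in-memory transcription of the rule stated above, not the external-memory procedure: the function name subtree_threshold, its arguments, and the use of None to signal pruning are all assumptions introduced for illustration, and highend of the node is passed in as a point.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def subtree_threshold(S: List[Point], high_vi: List[Point],
                      highend_vi: Point, beta: float,
                      B: int) -> Optional[float]:
    """Return the recursion threshold for one node, or None if it is pruned.

    Hypothetical helper: S is the set collected from MAX(u), high_vi stands
    for high(v_i), and highend_vi for the point highend(v_i).
    """
    S_set = set(S)
    S_i = [p for p in high_vi if p in S_set]  # high(v_i) intersected with S
    if len(S_i) < B:
        return None  # fewer than B points: the subtree is pruned
    # Maximum y-coordinate among the points of S lower than highend(v_i);
    # fall back to beta when no such point exists.
    lower = [y for (_, y) in S if y < highend_vi[1]]
    return max(lower, default=beta)
```

A caller would recurse into the subtree of the node only when the returned value is not None, using it as the lower y-bound of the recursive query.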

The skyline points in (ii) can be retrieved with a top-open query on the few-point structure of 
the trunk z covering x_p, where z can be reached in constant time by following a pointer stored at u. 
Specifically, set β_0 to the maximum y-coordinate of the points in S (if S ≠ ∅), or to β (if 
S = ∅). The top-open query for z has rectangle (x_p, max-x(z)] × [β_0, U], where max-x(z) is the 
maximum x-coordinate covered by z. 
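
A minimal sketch of this step, assuming the trunk's points are available as a plain list: it computes the lower y-bound from S and then answers the half-open query by brute force. The function name trunk_query and its parameters are hypothetical, the brute-force scan stands in for the few-point structure, and distinct coordinates are assumed.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def trunk_query(trunk_points: List[Point], S: List[Point], beta: float,
                x_p: float, max_x_z: float) -> Tuple[float, List[Point]]:
    """Return the lower y-bound and the maxima inside the query rectangle.

    Hypothetical helper; the rectangle excludes x_p and includes max_x_z.
    """
    # Lower y-bound: the highest point of S if S is non-empty, else beta.
    beta0 = max((y for (_, y) in S), default=beta)
    cand = [q for q in trunk_points
            if x_p < q[0] <= max_x_z and q[1] >= beta0]
    # Right-to-left sweep keeps exactly the points not dominated by
    # another candidate.
    cand.sort(key=lambda q: q[0], reverse=True)
    result: List[Point] = []
    best_y = float("-inf")
    for x, y in cand:
        if y > best_y:
            result.append((x, y))
            best_y = y
    result.reverse()
    return beta0, result
```

The left endpoint is exclusive so that p itself, which has already been reported as part of high(u), is not returned again.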

Now we analyze the query cost. If fewer than B points of high(u) are reported, the algorithm 
finishes with 1 I/O. Otherwise, the scan of MAX(u) takes O(1 + |S|/B) I/Os. If |S| < B, we charge 
the O(1 + |S|/B) = O(1) cost to the B points in high(u); otherwise, we charge the O(|S|/B) cost 
to the points of S. The top-open query on the few-point structure of z requires O(1 + k'/B) I/Os 
if it returns k' points. If k' < B, we charge the O(1 + k'/B) = O(1) cost to the points of high(u); 
otherwise, we charge the O(k'/B) I/Os to the k' points. 

It remains to discuss the I/Os spent on v_1, ..., v_c. For each i ∈ [1, c], if |S_i| < B, there is no 
cost on v_i. Otherwise, we charge to the points of S_i the O(1) I/Os needed to read (i) high(v_i) and 
(ii) the first block of MAX(v_i). Overall, every reported point is charged O(1/B) I/Os. The total 
query time is therefore O(1 + k/B). 
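
The charging argument can be summarized as a single sum. In the sketch below, c_q denotes the total charge placed on a reported point q; this symbol is introduced here only for illustration. The sketch relies on the fact, implicit in the argument above, that each reported point is charged by a constant number of steps, each contributing O(1/B), and that the only uncharged cost is the O(1) I/Os spent at the node where the query starts.

```latex
\[
  \text{total I/Os}
  \;\le\; O(1) + \sum_{q \text{ reported}} c_q
  \;=\; O(1) + k \cdot O\!\left(\frac{1}{B}\right)
  \;=\; O\!\left(1 + \frac{k}{B}\right).
\]
```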

A final remark concerns the output ordering, which has been ignored by the above algorithm 
in order to keep the presentation simple. However, it is easy to modify the algorithm to ensure the 
desired ordering. First, the points of high(u) are clearly the highest in the result, and hence, are 
reported first. Consider the moment when we have obtained S_1, ..., S_c (recall that their union is 
S). For each i ∈ [2, c], we do not report S_i until all the points from the subtree of v_{i-1} have been 
output. This means that we report S_i only after (i) if |S_{i-1}| < B, S_{i-1} has been output, or (ii) 
otherwise, the skyline of P(v_{i-1}, β_{i-1}) has been output. 






